Abstract: The efficiency of parameter estimation of quantum
channels is studied in this paper. We introduce the concept of
programmable parameters to the theory of estimation. It is found
that programmable parameters obey the standard quantum limit
strictly; hence no speedup is possible in their estimation. We
also construct a class of non-unitary quantum channels whose
parameter can be estimated in a way that breaks the standard
quantum limit. The study of estimation of general quantum
channels also enables an investigation of the effect of noise on
quantum estimation.

Index Terms: Heisenberg limit, Parameter estimation,
Programmable gates, Quantum channels, Standard quantum limit



I. Introduction 

PARAMETER estimation, which is central to mathemati- 
cal statistics, is also an elementary problem in information 
theory. Its main objective is to construct and evaluate various 
methods that can estimate the values of parameters of either 
an information source or a communication channel. Unlike in 
the usual scenarios of information theory where the source 
and the channel are exactly known, we now have a source or 
a channel that depends on some unknown parameters. Taking
the binary symmetric channel as an example, we might know
that the channel is indeed binary symmetric but not have
any information about the probability of it making a flip error.
Thus, before we can make use of it in communication, we
had better determine the error probability first. This is the
most basic situation where parameter estimation takes place 
and we will see later that it also arises in other quite different 
applications. 

Historically, the research of this topic dates back to the 
origin of mathematical statistics, though the concept of "a 
family of distributions with parameters" did not emerge until 
the 1920s [1]. Over the course of this development, statisticians
have established different methods to make inferences about
parameters: maximum likelihood estimators, Bayes estimators,
method of moments estimators, etc. (see, for example, [2]
for detailed discussions). At the same time, an important
inequality, the Cramer-Rao inequality, was discovered
which sets lower bounds on the variance of any estimator in 
terms of Fisher information [3]-[5]. Fisher further showed that 
maximum likelihood estimators can achieve the lower bound 
asymptotically [4], [6] and made the Cramer-Rao inequality 
essential to estimation theory. These results from statistics 
have already been applied to various problems in information 
theory. 

As quantum mechanics provides us with a more precise 
model of describing reality, it is necessary to study estimation 
theory directly based on quantum mechanics instead of the 
empirical models in statistics. Helstrom [7], [8] and Holevo [9] 
pioneered the study of estimation theory in the quantum 
setting. The quantum version of the Cramer-Rao inequality was
established in [7]-[11]. It was shown by Braunstein, Caves
and Milburn that the lower bound, the reciprocal of quantum 
Fisher information, is also achievable asymptotically [11]. 
This inequality has fundamental implications in physics. It is 
closely related to skew information proposed by Wigner and 
Yanase [12], [13] and also implies the parameter-based version 
of Heisenberg's uncertainty relation [11]. 

From the Cramer-Rao inequality, or alternatively, from the
central limit theorem, we know that the standard deviation of
an estimator scales as $1/\sqrt{N}$, where $N$ is the number of
samples observed from the parameterized source. Such a rate
of convergence is fundamental and universal. It also occurs
in parameter estimation of quantum information sources as
pointed out, for example, in [11]. In the physics literature, the
scaling of order $\Omega(1/\sqrt{N})$ is sometimes called the standard
quantum limit or the shot noise limit. A fascinating aspect
of the quantum case is that such a limit can be beaten!
Namely, if instead of estimating parameters of a quantum
information source, we are interested in knowing to some
precision the parameters of a quantum channel, then it is possible
to have a scaling of $O(1/N)$, where $N$ stands for the number
of times the channel is used. The new scaling is the so-called
Heisenberg limit and accounts for a quadratic speedup
in the estimation compared to the standard quantum limit. This
important observation of fast estimation has appeared recently in a
number of papers motivated by applications in diverse
fields: quantum clocks [14], clock synchronization [15],
[16], transfer of reference frames [17]-[19], and so on [20]-[26].
For a more complete enumeration, see the recent survey
papers [27], [28].

Parameter estimation of quantum channels is thus special: 
there are parameters that can be estimated with a convergence 
rate never achievable in the classical theory. We will call an 
estimator superefficient if it converges faster than the standard 
quantum limit. Later on, we will also call a parameter
superefficient (inefficient) if it can (cannot) be estimated
superefficiently. The research of superefficient parameters of quantum
channels is important not only to the various applications that arise
in practice, but also to the theory of quantum information
and statistics. It fundamentally characterizes the precision 
threshold that quantum mechanics permits in a measurement. 

However, most previous work focuses only on fast
parameter estimation of unitary evolutions, the noiseless quantum
channels. In this paper, we initiate the study of
parameter estimation of general quantum channels, including
unitary transforms. The emphasis of this paper is
to characterize parameters that can or cannot be estimated
superefficiently.

Fast parameter estimation of unitary evolutions will be 
reviewed briefly. We will analyze estimation protocols that can 
exceed the standard quantum limit and will see the intrinsic 
relation of two seemingly different protocols. 

Next, we will provide a general criterion that rules out 
the possibility of a large class of parameters being estimated 
superefficiently. This is made possible by introducing the con- 
cept of programmability to the estimation theory. If a family 
of channels specified by some parameter is programmable, 
then any estimation protocol of the parameter cannot exceed 
the standard quantum limit. That is, a programmable parameter
of a quantum channel behaves much like a classical one,
unable to exploit the quantum advantage. The programmability
argument, though simple, has non-trivial implications and
greatly simplifies the analysis. For example, an interesting
corollary of it is that all parameters of classical discrete
memoryless channels are inefficient. Another important implication
is that the presence of depolarizing noise, no matter how
small, will "ruin" the efficiency of estimation of all quantum
channels.

On the other hand, we will also apply a general technique 
that can help the construction of superefficient estimation pro- 
tocols. This technique is borrowed from Rudolph and Grover's 
method of establishing a shared reference frame. Using this 
technique, we will show that parameters in a non-unitary 
quantum channel may also be estimated superefficiently. 

This paper is organized as follows. Section II is devoted to
the introduction of some basic notations of estimation theory
and quantum information theory. We will discuss the Cramer-Rao
inequality in both the classical and quantum settings in
this section. In Section III we review and analyze some of
the protocols that estimate parameters of unitary operations
superefficiently. A technique of parameter amplification is
discussed in detail, which will be used later in Section V. The
concept of programmable channels is introduced in Section IV.
Some parameter estimation problems and interesting
corollaries are studied in that section based on the "no-go"
criterion we propose in terms of programmability. Section V
provides non-trivial examples of fast parameter estimation of
non-unitary channels.

II. NOTATIONS AND BACKGROUNDS OF ESTIMATION THEORY AND QUANTUM
INFORMATION THEORY

In this section, we will discuss several topics that are 
important to this work. First, we will review some of the basic
facts of the classical theory of parameter estimation. We will
then move on to the quantum case after a brief introduction 
to the concepts and notations of quantum information theory. 

A. The classical theory of parameter estimation 

In mathematical statistics, the parameter estimation problem
is formalized in terms of a family of distributions $f(x;\theta)$.
Here $\theta$ is the parameter to be estimated, which belongs to a
parameter set $\Theta$. We will only consider bounded parameter sets
for simplicity. Suppose that a sample $\xi_1, \xi_2, \ldots, \xi_N$ of size $N$
is drawn independently from the parameterized distribution.
An estimator $\hat\theta$ for $\theta$ based on this sample is a function of the $N$
observed values $(\xi_1, \xi_2, \ldots, \xi_N)$, valued in $\Theta$.

The estimator is said to be unbiased if its expectation $E(\hat\theta)$
equals the unknown parameter $\theta$. Another qualitative
evaluation of an estimator is its consistency: we say an estimator
is consistent if it converges to the unknown parameter in
probability as the sample size tends to infinity. We will only
consider consistent estimators in this paper. To evaluate an
estimator $\hat\theta$ quantitatively, the mean squared error (MSE)

$$E(\hat\theta - \theta)^2 \tag{1}$$

is usually employed. The smaller the MSE, the better the precision
the estimator promises.

In the language of information theory, the distributions
$f(x;\theta)$ can be thought of as the statistics of a memoryless
source with a hidden parameter $\theta$. For example, it can be a
discrete memoryless source $B_\theta$ with source statistics $p(0) =
1 - \theta$, $p(1) = \theta$, where $\theta \in [0,1]$ is the parameter. A good
estimator for $\theta$ one can easily imagine is the sample mean
$\hat\theta = \sum_i \xi_i / N$. The estimator is obviously unbiased and has a
variance of $\theta(1-\theta)/N$. For an unbiased estimator, the mean
squared error is equal to its variance. Thus, we would like to
find an unbiased estimator with as small a variance as possible.
However, the Cramer-Rao inequality sets lower bounds on the
variance. For example, it tells us that $\hat\theta$ has the least variance
possible among all unbiased estimators of $\theta$.

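Theorem 1 (The Cramer-Rao inequality): For any estimator
$\hat\theta$,

$$\mathrm{Var}(\hat\theta) \ge \frac{(dE(\hat\theta)/d\theta)^2}{J(\theta)}. \tag{2}$$

Here $J(\theta)$ is the Fisher information defined as

$$J(\theta) = E\left[\frac{\partial}{\partial\theta}\ln f(X;\theta)\right]^2, \tag{3}$$

where $X \sim f(x;\theta)$.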

When $\hat\theta$ is unbiased, $dE(\hat\theta)/d\theta = 1$, so we can rewrite the
inequality as

$$E(\hat\theta - \theta)^2 \ge \frac{1}{J(\theta)}. \tag{4}$$



The proof of the above theorem can be found in, for
example, [5] or [10]. It is also easy to show that the Fisher
information $J(\theta)$ is additive. Concretely, let $J_1(\theta)$, $J_2(\theta)$ be
the Fisher information of distributions $f(x;\theta)$ and $g(y;\theta)$
respectively. The Fisher information $J_{12}(\theta)$ of the joint
distribution $f(x;\theta)g(y;\theta)$ is equal to $J_1(\theta) + J_2(\theta)$. Applying
this observation, we get the Cramer-Rao inequality for
estimators of sample size $N$:

$$E(\hat\theta - \theta)^2 \ge \frac{1}{NJ(\theta)}. \tag{5}$$



It can be easily verified that for the source $B_\theta$, the Fisher
information $J(\theta)$ is $[\theta(1-\theta)]^{-1}$. Thus $\hat\theta$ is optimal as mentioned.
In some cases, it might be possible that there does not exist
any estimator that can saturate the lower bound in the Cramer-Rao
inequality. However, Fisher showed that, except for some
extreme cases, the maximum likelihood estimator can always
achieve the lower bound in the limit of large sample size $N$ [6].
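
The claim that the sample mean saturates the bound for this source can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original analysis; the function names and the values of `theta` and `n` are illustrative assumptions.

```python
from fractions import Fraction


def fisher_information(theta):
    """Fisher information of one sample from p(0) = 1 - theta, p(1) = theta.

    Computed from the definition J = E[(d/dtheta ln f(X; theta))^2].
    """
    score_if_one = 1 / theta            # d/dtheta ln(theta)
    score_if_zero = -1 / (1 - theta)    # d/dtheta ln(1 - theta)
    return theta * score_if_one ** 2 + (1 - theta) * score_if_zero ** 2


def sample_mean_variance(theta, n):
    """Exact variance of the sample mean of n independent samples."""
    return theta * (1 - theta) / n


theta = Fraction(3, 10)   # illustrative parameter value (assumption)
n = 50                    # illustrative sample size (assumption)

bound = 1 / (n * fisher_information(theta))   # right-hand side of Eq. (5)
variance = sample_mean_variance(theta, n)     # MSE of the unbiased sample mean
print(bound, variance)
```

Both printed values coincide, which is the sense in which the sample mean is optimal among unbiased estimators for this source.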

A corollary of Eq. (5) which is important to this paper is that
no unbiased estimator can have its variance converging to zero
at a rate faster than the order of $1/N$, where $N$ is the sample
size. In terms of the standard deviation, this is a convergence
rate of $\Omega(1/\sqrt{N})$. In the following, we will say that an estimator,
or an estimation protocol, is of order $1/\sqrt{N}$ ($1/N$, etc.) if its
standard deviation converges with order $1/\sqrt{N}$ ($1/N$, resp.)
for all possible $\theta$.

In the previous analysis, we have derived the rate of
convergence from Eq. (5), which only applies to unbiased
estimators. We now claim that biased estimators are also of order
$\Omega(1/\sqrt{N})$ in terms of the root mean squared error (RMSE)
instead of the standard deviation. As we have assumed the
parameter set to be bounded, $E(\hat\theta)$ converges point-wise to $\theta$
as $N \to \infty$ for any consistent estimator $\hat\theta$. It follows from the
mean value theorem that there exists a specific $\theta_0$ such that

$$\left.\frac{dE(\hat\theta)}{d\theta}\right|_{\theta=\theta_0}$$

is close to 1 for large $N$. Combining the fact that

$$E(\hat\theta - \theta)^2 \ge \mathrm{Var}(\hat\theta)$$

and Eq. (2), we complete the justification of the claim. Thus,
we have shown that any estimator of parameters of a classical
information source is of order $\Omega(1/\sqrt{N})$. We note that the
locally normalized deviation measure

$$E\left(\frac{\hat\theta}{|dE(\hat\theta)/d\theta|} - \theta\right)^2 \tag{6}$$

was employed to deal with the case of biased estimators
in [10]. We insist on using the MSE in this paper as it is much
easier to calculate and provides us with a uniform criterion in
evaluating different estimators.

Before we introduce the quantum Cramer-Rao inequality,
we will first review quantum mechanics from an
information-theoretical point of view. For a more detailed presentation of
the theory of quantum information, the readers are referred
to [29].

B. Quantum information sources and quantum channels 

In quantum information theory, the quantum state plays the role
of the information carrier. Any quantum state can be described
by a positive semidefinite operator $\rho$ with unit trace. When
diagonal, it degenerates to a discrete probability distribution
and is thus also a natural description of a quantum information
source.

The evolution of a closed quantum system is characterized
by a unitary operation $U$ which maps $\rho$ to $U\rho U^\dagger$. As a
special type of quantum channel, unitary evolution is invertible
and noiseless. A general quantum channel is mathematically
a superoperator $\mathcal{E}$ which is completely positive and
trace-preserving. That is, for any positive semidefinite operator $\rho$,
$(\mathcal{I} \otimes \mathcal{E})(\rho)$ is positive semidefinite and
$\mathrm{tr}(\mathcal{E}(\rho)) = \mathrm{tr}(\rho)$, where
$\mathcal{I}$ is the identity superoperator. The effect of any quantum
channel $\mathcal{E}$ can be viewed as the dynamics of one part of a
larger closed system. Namely, there always exists a unitary
operation $U$ such that for all $\rho$,

$$\mathcal{E}(\rho) = \mathrm{tr}_{env}\left[U\left(\rho \otimes |0\rangle\langle 0|_{env}\right)U^\dagger\right]. \tag{7}$$



Another description of quantum channels which is easy to use
is Kraus' operator-sum representation. In this representation,
any channel $\mathcal{E}$ is specified by a set of operators $E_i$
satisfying $\sum_i E_i^\dagger E_i = I$ and

$$\mathcal{E}(\rho) = \sum_i E_i \rho E_i^\dagger. \tag{8}$$

Different sets of operators, $\{E_i\}_{i=1}^m$ and $\{F_j\}_{j=1}^n$, may
correspond to the same quantum channel. When $m = n$, this
occurs if and only if there exists $u_{ij}$ such that $E_i = \sum_j u_{ij}F_j$
and $(u_{ij})$ is unitary. It is thus called the unitary freedom in
the operator-sum representation [29]. Note that in the case of
$m \neq n$, we can append zero operators to the set having the
smaller number of operators.

One of the simplest quantum channels of interest is the qubit
depolarizing channel

$$\mathcal{E}(\rho) = p\frac{I}{2} + (1-p)\rho. \tag{9}$$

It is naturally the quantum counterpart of the binary symmetric
channel. One of its operator-sum representations is specified by

$$\left\{\sqrt{1-3p/4}\,I,\ \sqrt{p}X/2,\ \sqrt{p}Y/2,\ \sqrt{p}Z/2\right\}, \tag{10}$$

where $X$, $Y$, $Z$ are the Pauli matrices. The Pauli matrices may
also be denoted by $\sigma_i$'s sometimes:

$$I = \sigma_0, \quad X = \sigma_1, \quad Y = \sigma_2, \quad Z = \sigma_3. \tag{11}$$

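
As a sanity check on the operator-sum representation in Eq. (10), the sketch below verifies numerically that the four operators satisfy the completeness condition and reproduce Eq. (9). It is an illustration only; the helper names, the value of `p`, and the test state `rho` are assumptions introduced here.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def kraus_operators(p):
    """The operator set of Eq. (10) for depolarizing strength p."""
    return [scale(math.sqrt(1 - 3 * p / 4), I2),
            scale(math.sqrt(p) / 2, X),
            scale(math.sqrt(p) / 2, Y),
            scale(math.sqrt(p) / 2, Z)]


def apply_channel(ops, rho):
    """Operator-sum action sum_i E_i rho E_i^dagger of Eq. (8)."""
    out = [[0, 0], [0, 0]]
    for e in ops:
        out = add(out, matmul(matmul(e, rho), dagger(e)))
    return out


p = 0.3                                       # illustrative noise level
rho = [[0.75, 0.25j], [-0.25j, 0.25]]         # illustrative qubit state
via_kraus = apply_channel(kraus_operators(p), rho)
via_formula = add(scale(p / 2, I2), scale(1 - p, rho))   # Eq. (9)
print(via_kraus)
print(via_formula)
```

The two printed matrices agree up to floating-point rounding.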
A special type of non-unitary quantum operation which is
important to the interpretation of quantum theory is the quantum
measurement. Quantum measurement is the bridge that links
the quantum and classical worlds and is the only way for us
to obtain classical information from a quantum system. One
of the formulations of quantum measurements is described
by a resolution of the identity $I$ into projectors $P_i$, $I =
\sum_{i=1}^m P_i$. The probability of observing $k$ is $\mathrm{tr}(\rho P_k)$ and the
post-measurement state becomes $P_k\rho P_k/\mathrm{tr}(\rho P_k)$. If we do
not care much about the post-measurement state, we can
employ another description called the positive-operator valued
measure (POVM). Mathematically, it is a resolution of the identity
$I$ into positive semidefinite operators $M_i$, $I = \sum_{i=1}^m M_i$. The
probability of observing result $k$ is $\mathrm{tr}(\rho M_k)$. For example,
the measurement along the basis $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$ and
$|-\rangle = (|0\rangle - |1\rangle)/\sqrt{2}$ can be modeled by $P_+$ and $P_-$,

$$P_+ = |+\rangle\langle +| = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \quad P_- = |-\rangle\langle -| = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}. \tag{12}$$

Measurements are quantum channels. Thus, they can also 
be described by the operator-sum representation. The above 
simple example can be specified by the following set of 
operators 

$$\{|0\rangle\langle +|,\ |1\rangle\langle -|\}. \tag{13}$$

Notice that we have chosen the measurement result, instead of 
the post-measurement state, to be the outcome of the channel. 

C. The quantum Cramer-Rao inequality 

We are now ready to introduce the quantum Cramer-Rao 
inequality which first appeared in [7]. We will sketch the proof 
for it because of its importance to one of our results. The proof 
is similar to the one given in [10]. 

Consider a quantum information source $\rho(\theta)$ which depends
on parameter $\theta$. It is beneficial to divide an estimation protocol
into two different steps [10]. In the first step, perform a
properly designed measurement $M$, and in the second, make
an estimation based on the data obtained in the previous step.
One can see that the second step is essentially the same as
a classical estimation protocol and the classical Cramer-Rao
inequality applies. That is, given the POVM $M = \{M_i\}_{i=1}^m$
chosen in the first step, we get a lower bound that depends on
$M$,

$$E(\hat\theta - \theta)^2 \ge \frac{1}{J_M(\theta)}, \tag{14}$$

where, writing $\rho' = d\rho/d\theta$,

$$J_M(\theta) = \sum_{i=1}^m \frac{[\mathrm{tr}(M_i\rho')]^2}{\mathrm{tr}(M_i\rho)}. \tag{15}$$


We have considered only unbiased estimators here and the 
biased case can be analyzed similarly as in the classical case. 

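Write the spectral decomposition $\rho = \sum_i p_i|i\rangle\langle i|$ and
define a superoperator $\mathcal{L}_\rho$ as

$$\mathcal{L}_\rho(O) = \sum_{j,k:\ p_j+p_k \neq 0} \frac{2}{p_j+p_k}\,O_{jk}\,|j\rangle\langle k|. \tag{16}$$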

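An important property of $\mathcal{L}_\rho$ is that for non-singular $\rho$ and
Hermitian matrices $A$ and $B$,

$$\mathrm{tr}(AB) = \mathrm{Re}\left[\mathrm{tr}(\rho A\mathcal{L}_\rho(B))\right]. \tag{17}$$

It follows by substitution that

$$J_M(\theta) = \sum_{i=1}^m \frac{\left(\mathrm{Re}\left[\mathrm{tr}(\rho M_i\mathcal{L}_\rho(\rho'))\right]\right)^2}{\mathrm{tr}(M_i\rho)}. \tag{18}$$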



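The validity of this substitution for singular $\rho$ is justified
in [10]. Hence,

$$\begin{aligned}
J_M(\theta) &\le \sum_{i=1}^m \frac{\left|\mathrm{tr}(\rho M_i\mathcal{L}_\rho(\rho'))\right|^2}{\mathrm{tr}(M_i\rho)} \\
&= \sum_{i=1}^m \left|\mathrm{tr}\left(\frac{\rho^{1/2}M_i^{1/2}}{\sqrt{\mathrm{tr}(M_i\rho)}} \cdot M_i^{1/2}\mathcal{L}_\rho(\rho')\rho^{1/2}\right)\right|^2 \\
&\le \sum_{i=1}^m \mathrm{tr}\left(M_i\mathcal{L}_\rho(\rho')\rho\mathcal{L}_\rho(\rho')\right) \\
&= \mathrm{tr}\left(\mathcal{L}_\rho(\rho')\rho\mathcal{L}_\rho(\rho')\right) \\
&= \mathrm{tr}\left(\rho'\mathcal{L}_\rho(\rho')\right),
\end{aligned} \tag{19}$$

where the second inequality follows from the Cauchy-Schwarz
inequality.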

The term in the final step of Eq. (19) is the quantum Fisher
information

$$J(\theta) = \mathrm{tr}\left(\rho'\mathcal{L}_\rho(\rho')\right), \tag{20}$$

which depends only on the parameterized state $\rho$ and we may
also denote it by $J_\rho(\theta)$ for clarity.
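
The property in Eq. (17) and the definition in Eq. (20) can be checked on a qubit whose state is diagonal, so that its eigenbasis is the computational basis. The sketch below is an illustration under that assumption; the matrices `A` and `B`, the spectrum `p`, and all function names are hypothetical choices made here. For the diagonal family $\rho(\theta) = \mathrm{diag}(1-\theta, \theta)$ the quantum Fisher information should reduce to the classical value $[\theta(1-\theta)]^{-1}$ of the source $B_\theta$.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def diag(p):
    return [[p[0], 0], [0, p[1]]]


def l_rho(p, o):
    """L_rho(O) of Eq. (16) for rho = diag(p[0], p[1]), in rho's eigenbasis."""
    return [[2 * o[j][k] / (p[j] + p[k]) if p[j] + p[k] != 0 else 0
             for k in range(2)] for j in range(2)]


# Property (17): tr(AB) = Re tr(rho A L_rho(B)) for Hermitian A and B.
p = (0.3, 0.7)                          # illustrative spectrum of rho
A = [[1, 2 - 1j], [2 + 1j, -3]]         # illustrative Hermitian matrix
B = [[0.5, 1j], [-1j, 2]]               # illustrative Hermitian matrix
lhs = complex(trace(matmul(A, B))).real
rhs = complex(trace(matmul(matmul(diag(p), A), l_rho(p, B)))).real
print(lhs, rhs)


def qfi_diagonal(theta):
    """Eq. (20) for rho(theta) = diag(1 - theta, theta)."""
    spectrum = (1 - theta, theta)
    d_rho = [[-1, 0], [0, 1]]           # derivative of rho with respect to theta
    return trace(matmul(d_rho, l_rho(spectrum, d_rho)))


print(qfi_diagonal(Fraction(3, 10)))
```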

The following theorem follows from Eqs. (14) and (19).

Theorem 2 (The quantum Cramer-Rao inequality): For
any unbiased estimator $\hat\theta$ for $\theta$ of $\rho(\theta)$,

$$E(\hat\theta - \theta)^2 \ge \frac{1}{J(\theta)}. \tag{21}$$

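Next, we show that the quantum Fisher information is also
additive. That is,

$$J_\rho(\theta) = J_\sigma(\theta) + J_\tau(\theta), \tag{22}$$

if $\rho = \sigma \otimes \tau$. The proof is simple. As $\rho' = \sigma' \otimes \tau + \sigma \otimes \tau'$,

$$\mathrm{tr}\left(\rho'\mathcal{L}_\rho(\rho')\right) = \mathrm{tr}\left(\sigma'\mathcal{L}_\sigma(\sigma') \otimes \tau + \sigma' \otimes \tau\mathcal{L}_\tau(\tau') + \sigma \otimes \tau'\mathcal{L}_\tau(\tau') + \sigma\mathcal{L}_\sigma(\sigma') \otimes \tau'\right), \tag{23}$$

and Eq. (22) follows by noticing that $\mathrm{tr}(\sigma') = \mathrm{tr}(\tau') = 0$.

Therefore, if $N$ replicas of $\rho$, $\rho^{\otimes N}$, are used in the estimation,
we have the corresponding Cramer-Rao inequality

$$E(\hat\theta - \theta)^2 \ge \frac{1}{NJ(\theta)}. \tag{24}$$
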

This means that any unbiased estimator is also of order
$\Omega(1/\sqrt{N})$, and so is the biased case by an argument similar
to the one used before. We would also like to point out that the above
analysis applies to any joint measurement on the $N$ copies,
as mentioned in [11]. Thus, the standard quantum limit is
essential for all parameters of quantum information sources.
We will refer to this result later in Section IV.

III. A REVIEW OF PARAMETER ESTIMATION FOR UNITARY 
OPERATIONS 

Unitary evolution is one of the most fundamental operations
in quantum information, and has therefore received the most
attention. It is also the first type of operation studied in
parameter estimation of quantum channels. The most amazing
observation is that, unlike parameters of both classical and
quantum information sources, parameters of unitary operations
can be estimated much faster! Two different approaches to
superefficient estimation are studied in the following. One of
them is of order $1/N$; the other is of order $\log N/N$.

A. Strategies that can beat the standard quantum limit

The parameter $\theta$ now determines a unitary $U(\theta)$. To estimate
the value of $\theta$, we will apply the unitary $N$ times to some
properly prepared states. Then, the problem becomes parameter
estimation of states, which we are more familiar with. However,
employing an operation in an estimation protocol is much more
flexible: it can be carried out in parallel, sequentially, or even
by mixing the two. In the following, we will discuss
some of the superefficient strategies that are in common use.

The first simple strategy we discuss carries out the unitaries
in parallel. See Fig. 1 for a demonstration of the layout.
It will be referred to as the parallel strategy. Now, if the
inputs are chosen to be product states of $\rho_1, \rho_2, \ldots, \rho_N$, we
will show that no estimation protocol can beat the standard
quantum limit no matter what kind of joint measurement and
post-measurement estimator one chooses. This is seen from
the additivity of quantum Fisher information and the quantum
Cramer-Rao inequality.



Fig. 1: Layout of the parallel strategy 



Before the measurement, the state can be written as

$$\bigotimes_{i=1}^N \left[U(\theta)\rho_iU^\dagger(\theta)\right], \tag{25}$$

whose Fisher information of $\theta$ is

$$\sum_{i=1}^N J_{U\rho_iU^\dagger}(\theta) \le N\max_i\left\{J_{U\rho_iU^\dagger}(\theta)\right\}. \tag{26}$$

The convergence rate of $\Omega(1/\sqrt{N})$ follows immediately from
the quantum Cramer-Rao inequality. This result is also noted
in [22] as the so-called CC and CQ strategies considered there.

If the input of the $N$ parallel unitary operations is chosen
to be some entangled state, the argument above does not work
anymore. In fact, we can find estimations of order $O(1/N)$
with the help of quantum entanglement.

To make the analysis simpler, we will focus on the estimation
of $\theta \in \Theta = [0, 1)$ of a single-qubit unitary

$$U = \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i\theta} \end{pmatrix}. \tag{27}$$

Yet, we claim that this simple case is essentially as general as
the estimation of the angular parameter of $U = e^{-i\theta H}$, where
$H$ is a known Hermitian operator independent of $\theta$. In the basis
of eigenvectors of $H$, $U$ has a diagonal matrix representation
and operates as a single-qubit unitary defined in Eq. (27) when
restricted to a two-dimensional subspace. We therefore do not
lose much by confining our attention to $U$ defined in Eq. (27).
The estimation of an angular parameter, though simple, has wide
applications in physics [9], [14], [22].

In Holevo's book [9], the optimal estimation of parallel
strategies was found for the angular parameter of $U = e^{-i\theta H}$
based on a theory of covariant measurements. The result was
employed recently by Bužek, Derka and Massar in designing
optimal quantum clocks [14] which can achieve the Heisenberg
limit. This speedup was found by optimizing the input
state that is fixed in Holevo's result. We will present an
analysis which is similar to the one given by Hayashi [20]
but will appeal to the Fourier basis measurement instead of
the covariant measurement. The Fourier basis measurement is
in fact one of the discrete versions of the covariant
measurement [30]. The procedure is illustrated in Fig. 2.



Fig. 2: Layout of the parallel strategy with entangled inputs
and Fourier basis measurement

Consider $U(\theta)$ given in Eq. (27) and define $N + 1$ special
states in the space on which $U^{\otimes N}$ acts:

$$|k\rangle = \binom{N}{k}^{-1/2} \sum_{l:\, w(l)=k} |l\rangle \tag{28}$$

for $k = 0, 1, \ldots, N$, where $w(l)$ is the Hamming weight of $l$.
By virtue of the parallel structure, we have

$$U^{\otimes N}|k\rangle = e^{2k\pi i\theta}|k\rangle. \tag{29}$$

This means that $U^{\otimes N}$ will rotate the input state

$$\sum_{k=0}^N a_k|k\rangle, \quad a_k \text{ is real}, \tag{30}$$

to

$$\sum_{k=0}^N a_k e^{2k\pi i\theta}|k\rangle, \tag{31}$$

where the $a_k$'s will be given later. The inverse Fourier transform

$$|k\rangle \mapsto \frac{1}{\sqrt{N+1}}\sum_{l=0}^N e^{-2\pi ikl/(N+1)}|l\rangle \tag{32}$$

brings the state further to

$$\frac{1}{\sqrt{N+1}}\sum_{k,l=0}^N a_k e^{2k\pi i(\theta - l/(N+1))}|l\rangle. \tag{33}$$

Finally, perform the computational basis measurement on each
of the qubits, and estimate $\theta$ by the number of 1's in the
measurement outcome divided by $N + 1$. This completes the
description of the parallel strategy.

What remains to be clarified is the efficiency of this
protocol. We will choose the expectation, denoted by $W$, of
$1 - \cos(2\pi(\hat\theta - \theta))$ instead of the MSE. One can see that
these two evaluations are equivalent, as $1 - \cos(2\pi(\hat\theta - \theta))$
is asymptotically $2\pi^2(\hat\theta - \theta)^2$ when $(\hat\theta - \theta)$ is small. The
choice of this type of evaluation function, which was first used
in Holevo's book [9], helps to simplify the calculation of a
closed-form formula for the averaged deviation.

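As the probability of observing $l$ 1's in the output is

$$\Pr(l) = \frac{1}{N+1}\left|\sum_{k=0}^N a_k e^{2k\pi i(\theta - l/(N+1))}\right|^2 = \frac{1}{N+1}\sum_{m,n=0}^N a_m a_n e^{2\pi i(\theta - l/(N+1))(m-n)}, \tag{34}$$

the expectation

$$\begin{aligned}
W &= E\left(1 - \cos(2\pi(\hat\theta - \theta))\right) \\
&= \sum_{l=0}^N \Pr(l)\left[1 - \cos\left(2\pi\left(\theta - \frac{l}{N+1}\right)\right)\right] \\
&= 1 - \sum_{l=0}^N \Pr(l)\cos\left(2\pi\left(\theta - \frac{l}{N+1}\right)\right).
\end{aligned} \tag{35}$$

Employing Eq. (34) and the fact that the LHS of the above equation
is real, we can continue the calculation as

$$\begin{aligned}
W &= 1 - \frac{1}{N+1}\sum_{l,m,n=0}^N \mathrm{Re}\left[a_m a_n e^{2\pi i(\theta - l/(N+1))(m-n+1)}\right] \\
&= 1 - \sum_{k=1}^N a_{k-1}a_k - a_0 a_N\cos(2\pi(N+1)\theta).
\end{aligned} \tag{36}$$

If we choose $a_0 = 0$, $W$ will be independent of $\theta$:

$$W = 1 - \sum_{k=2}^N a_{k-1}a_k. \tag{37}$$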


Now, we need to minimize $W$ subject to the normalization
condition

$$\sum_{k=1}^N a_k^2 = 1. \tag{38}$$

One can see that the minimum value of $W$ is equal to the
minimum eigenvalue of the $N$ by $N$ matrix $A$ whose diagonal
elements are all 1 and secondary diagonal elements are all
$-1/2$:

$$A = \begin{pmatrix}
1 & -\frac{1}{2} & & \\
-\frac{1}{2} & 1 & \ddots & \\
 & \ddots & \ddots & -\frac{1}{2} \\
 & & -\frac{1}{2} & 1
\end{pmatrix}. \tag{39}$$

The minimum eigenvalue of $A$ is $2\sin^2\frac{\pi}{2N+2}$ and the
corresponding eigenvector gives the values of the $a_k$'s for $k \ge 1$:

$$a_k = \sqrt{\frac{2}{N+1}}\,\sin\frac{k\pi}{N+1}. \tag{40}$$

For large $N$, $W = 2\sin^2\frac{\pi}{2N+2}$ is obviously of order $1/N^2$.
Thus the estimation protocol given above is of order $O(1/N)$
in terms of RMSE.
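
The closed form for the minimum of $W$ can be cross-checked numerically by evaluating Eqs. (34) and (35) directly with the amplitudes of Eq. (40). The following is a minimal sketch for illustration; the function names and the chosen values of $N$ and $\theta$ are assumptions made here.

```python
import cmath
import math


def optimal_amplitudes(n):
    """a_0 = 0 and a_k = sqrt(2/(n+1)) sin(k pi/(n+1)) for k = 1..n, Eq. (40)."""
    return [0.0] + [math.sqrt(2 / (n + 1)) * math.sin(k * math.pi / (n + 1))
                    for k in range(1, n + 1)]


def outcome_probabilities(a, theta):
    """Pr(l) for l = 0..N, evaluated from the first form of Eq. (34)."""
    n = len(a) - 1
    probs = []
    for l in range(n + 1):
        amp = sum(a[k] * cmath.exp(2j * math.pi * k * (theta - l / (n + 1)))
                  for k in range(n + 1))
        probs.append(abs(amp) ** 2 / (n + 1))
    return probs


def expected_cost(a, theta):
    """W = sum_l Pr(l) [1 - cos(2 pi (theta - l/(N+1)))], Eq. (35)."""
    n = len(a) - 1
    probs = outcome_probabilities(a, theta)
    return sum(pr * (1 - math.cos(2 * math.pi * (theta - l / (n + 1))))
               for l, pr in enumerate(probs))


n_uses = 10      # illustrative number of parallel uses N
theta = 0.37     # illustrative true parameter
a = optimal_amplitudes(n_uses)
print(expected_cost(a, theta))
print(2 * math.sin(math.pi / (2 * n_uses + 2)) ** 2)
```

Up to rounding, the two printed numbers agree, and repeating the evaluation with another value of `theta` gives the same cost, in line with the remark that $W$ no longer depends on $\theta$ once $a_0 = 0$.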

Our next strategy mixes both parallel and sequential parts
but still has a simple structure, as depicted in Fig. 3. Namely,
it prepares $n$ qubits in parallel and applies the unitary $U$ on
the $j$th qubit $2^{j-1}$ times and thus $N = 2^n - 1$ times in total.
It is easy to see that before the inverse Fourier transform, the
state of the $n$ qubits is

$$\sum_{k=0}^N a_k e^{2k\pi i\theta}|k\rangle, \tag{41}$$

given that the initial state is

$$\sum_{k=0}^N a_k|k\rangle. \tag{42}$$

The above two equations have the same form as Eqs. (30)
and (31). Thus the analysis of the mixed strategy presented
here will be essentially the same as that of the parallel strategy,
though they look quite different. It is worth noting that, recently, a
similar strategy based on optimizing the input state was discovered
independently in [31], which dramatically improves the average
efficiency of the phase estimation protocol proposed in [32].
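
The phase pattern produced by the mixed strategy can be illustrated with a small state-vector calculation. This sketch assumes that the $j$th qubit corresponds to bit $j-1$ of the index $k$ (least significant bit first) and uses a uniform input for concreteness; both choices, and the function name, are assumptions made here rather than details taken from the text.

```python
import cmath
import math


def apply_mixed_strategy(amplitudes, theta):
    """Apply U^(2^(j-1)) to qubit j of an n-qubit state; return state and use count.

    amplitudes[k] is the amplitude of |k>, with qubit j stored in bit j-1 of k.
    """
    n_qubits = len(amplitudes).bit_length() - 1
    assert len(amplitudes) == 2 ** n_qubits
    out = list(amplitudes)
    uses = 0
    for j in range(1, n_qubits + 1):
        reps = 2 ** (j - 1)                       # U is applied reps times on qubit j
        uses += reps
        phase = cmath.exp(2j * math.pi * theta * reps)
        for k in range(len(out)):
            if (k >> (j - 1)) & 1:                # qubit j is in state |1>
                out[k] *= phase
    return out, uses


theta = 0.13                                      # illustrative parameter
a = [1 / math.sqrt(8)] * 8                        # uniform 3-qubit input (assumption)
state, total_uses = apply_mixed_strategy(a, theta)
print(total_uses)                                 # number of uses of U, 2^3 - 1
print(state[5], a[5] * cmath.exp(2j * math.pi * 5 * theta))
```

Each amplitude picks up the phase $e^{2k\pi i\theta}$, as in Eq. (41), while $U$ is used $2^n - 1$ times.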



Fig. 3: Layout of a mixed strategy based on Fourier transform

We have now seen how an estimation protocol beats the
standard quantum limit. A more ambitious question is whether
it is possible to find even better protocols which converge
faster than the order of $1/N$. Unfortunately, this has been proven
impossible, for example, in [22] by employing an uncertainty
relation implied by the quantum Cramer-Rao inequality.
Though the proof there considers only unitary operations, the
result applies to general quantum channels because of Eq. (7).
A scaling of $1/N^k$ has been reported recently in a quite different
problem setting [33] and does not contradict this result.

Unlike the previous strategies we have discussed, another
important class of strategies has a much looser structure.
Its spirit is to accumulate the parameter before
we observe. Because this class of strategies is closely
related to one of the main results of this paper, we organize the
treatment of it in a separate part.

B. The technique of amplifying parameters

It has been widely noticed that angular parameters of a
unitary can be easily accumulated either with or without
entanglement [16], [17], [22], [27]. Again, let $U$ be

$$U = \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i\theta} \end{pmatrix} \tag{43}$$

but consider only $\theta \in \Theta = [0, 1/2]$ for simplicity. By applying
it in parallel to the $n$-qubit GHZ state [34]

$$\frac{|00\cdots 0\rangle + |11\cdots 1\rangle}{\sqrt{2}}, \tag{44}$$

one gets

$$\frac{|00\cdots 0\rangle + e^{2\pi in\theta}|11\cdots 1\rangle}{\sqrt{2}}, \tag{45}$$

which is a state that depends on $n\theta$. The parameter is thus
amplified.

Angular parameters can also be amplified without
entanglement. To see this, we employ a sequential strategy that
applies the same $U$ $n$ times on the state $(|0\rangle + |1\rangle)/\sqrt{2}$. The
amplification is also done, since the state is finally rotated to

$$\left(|0\rangle + e^{2\pi in\theta}|1\rangle\right)/\sqrt{2}.$$

Any estimation of $N\theta$ also provides an estimator for $\theta$
simply by dividing the estimated value by $N$. Moreover, if
the estimation of $N\theta$ has RMSE bounded by a constant, the
estimation of $\theta$ seems to be of order $1/N$. However, it is
noticed in the literature that this argument is not rigorous [16]
and only applies for $\theta$ small enough (compared to $1/(2N)$).
The reason is that $e^{2\pi i\theta}$ is periodic and we cannot always
decide the value of $\theta$ from $e^{2\pi iN\theta}$.
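
The ambiguity can be made concrete with a few lines of arithmetic. In the sketch below, `naive_estimate` is a hypothetical helper that reads $N\theta$ modulo 1 off the amplified phase and divides by $N$; it is an illustration of the problem and not a protocol from the literature.

```python
import cmath
import math


def amplified_phase(theta, n):
    """The relative phase e^{2 pi i n theta} carried by the amplified state."""
    return cmath.exp(2j * math.pi * n * theta)


def naive_estimate(theta, n):
    """Recover n*theta modulo 1 from the phase, then divide by n."""
    angle = cmath.phase(amplified_phase(theta, n)) / (2 * math.pi)
    return (angle % 1.0) / n


n = 10
print(naive_estimate(0.03, n))   # small theta: the estimate is close to 0.03
print(naive_estimate(0.23, n))   # larger theta: the estimate is again close to 0.03
```

The two parameters differ by a multiple of $1/N$, so they produce the same amplified phase and cannot be told apart without additional information.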

Fortunately, however, there is an ingenious way to
deal with this problem. Rudolph and Grover proposed an
iterative procedure which can determine the first $k$ bits of
the parameter [17]. They used this method to establish a
shared reference frame between two remote parties. Later,
de Burgh and Bartlett used it in their clock synchronization
protocols [16]. This technique, which can be proven to be of
order $O(\log N/N)$, is in fact a general method for parameter
estimation provided that the parameter can be amplified.

Yet, there is a loophole in Rudolph and Grover's bitwise
protocol which sometimes makes the protocol problematic. In
their paper, $T$ and $T'$ are the parameter to be estimated and a
possible estimation respectively. They assume that $|T - T'| <
1/4$ implies that $T'$ agrees with $T$ to at least the first bit. This
is generally not true no matter how close $T'$ and $T$ are. A
careful verification shows that if the true parameter $T$ is 1/2,
their protocol is not even a consistent estimation, that is, $T'$
does not converge to $T$ in probability. We will modify the
protocol to close the loophole.

The modified protocol still contains $k$ steps. In the first
step, we prepare the state $(|0\rangle + |1\rangle)/\sqrt{2}$, apply $U$ once, and
measure in the Hadamard basis $\{|+\rangle, |-\rangle\}$. The probability
of observing $+$ is $P_+ = \cos^2(\pi\theta)$. Repeat the procedure $n$
times and take the sample mean $\hat{P}$ as an estimate of
$P_+$. Let $\hat\theta$ be the parameter corresponding to $\hat{P}$,
that is, $\hat{P} = \cos^2(\pi\hat\theta)$. Obviously, there exists some constant
$\delta$ such that $|P_+ - \hat{P}| < \delta$ implies $|\theta - \hat\theta| < 1/12$. Choose $n$
large enough to ensure

$$\Pr[|P_+ - \hat{P}| < \delta] > 1 - \epsilon/k. \tag{46}$$

Thus, with at least the same probability, $|\hat\theta - \theta| < 1/12$.
Consider the following three cases depending on the computed value
of $\hat\theta$.

1) If $\hat\theta \in [0, 5/12)$, the probability of $\theta \in [0, 1/2]$ is at least
$1 - \epsilon/k$. Define $r_1 = 2$ and $v_1 = 0$ in this case.

2) If $\hat\theta \in [5/12, 7/12]$, the probability of $\theta \in [1/3, 2/3]$ is
at least $1 - \epsilon/k$. Define $r_1 = 3$, $v_1 = 1$.

3) Otherwise, $\hat\theta \in (7/12, 1]$ and the probability of $\theta \in [1/2, 1]$
is at least $1 - \epsilon/k$. Define $r_1 = 2$, $v_1 = 1$.

This finishes the first step.
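As a sanity check on the three cases, the following Python sketch verifies with exact rational arithmetic that whenever the estimate is within 1/12 of the true parameter, the true parameter lies in the selected interval $[v_1/r_1, (v_1+1)/r_1]$. The function names `choose_radix` and `interval_is_safe` and the grid of test points are illustrative assumptions, not part of the protocol description.

```python
from fractions import Fraction

def choose_radix(theta_hat):
    """Return (r_1, v_1) for an estimate theta_hat in [0, 1], following cases 1)-3)."""
    if theta_hat < Fraction(5, 12):
        return 2, 0          # theta is claimed to lie in [0, 1/2]
    if theta_hat <= Fraction(7, 12):
        return 3, 1          # theta is claimed to lie in [1/3, 2/3]
    return 2, 1              # theta is claimed to lie in [1/2, 1]

def interval_is_safe(grid=120):
    """Check all grid pairs (theta_hat, theta) with |theta_hat - theta| < 1/12."""
    for a in range(grid + 1):
        theta_hat = Fraction(a, grid)
        r, v = choose_radix(theta_hat)
        for b in range(grid + 1):
            theta = Fraction(b, grid)
            if abs(theta_hat - theta) < Fraction(1, 12):
                # the selected interval is [v/r, (v+1)/r]
                if not (Fraction(v, r) <= theta <= Fraction(v + 1, r)):
                    return False
    return True
```

An estimate close to 1/2 selects the radix 3, so a true parameter of exactly 1/2 sits in the interior of the chosen interval rather than on a boundary between two binary digits.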

In the first step we have ensured that the true parameter
belongs to an interval of length $1/r_1$ with high probability.
We continue this idea in the following steps. In the
second step, we still prepare $(|0\rangle + |1\rangle)/\sqrt{2}$, but apply $U$ $r_1$
times instead, where $r_1 \in \{2, 3\}$ is the value determined in the
previous step. The rest is similar to the
first step if we regard the fractional part of $r_1\theta$ as $\theta$. The second
step determines the values of $r_2 \in \{2, 3\}$ and $v_2$ in a similar
way. After the second step, the "possible" interval of the true
parameter is of length $1/(r_1 r_2)$. In the third step, the unitary $U$
is carried out sequentially $r_1 r_2$ times in each trial. Similarly, $U$ is
applied $\prod_{i=1}^{k-1} r_i$ times in each trial of the $k$th step. After all the $k$
steps, we can make sure that the parameter $\theta$ is in an interval
of length $1/\prod_{i=1}^{k} r_i$ with probability at least $(1 - \epsilon/k)^k$. We
conclude the whole procedure by accepting

$$\hat\theta = \sum_{i=1}^{k} v_i \prod_{j=1}^{i} r_j^{-1} \tag{47}$$

as the estimated value, represented in a mixed radix system.
To ensure Eq. (46), we can choose

$$n > \frac{1}{2\delta^2} \ln\frac{2k}{\epsilon}, \tag{48}$$

which follows from the Chernoff inequality

$$\Pr[|\hat{P} - P_+| \ge \delta] \le 2e^{-2n\delta^2}. \tag{49}$$



Thus the total number of times that $U$ is applied in the $k$
steps is

$$N = n(1 + r_1 + r_1 r_2 + \cdots + r_1 r_2 \cdots r_{k-1}). \tag{50}$$

We will prove that the protocol given above is indeed a
superefficient estimation of order $O(\log N/N)$. After all the
$k$ steps, we have

$$\Pr\Big[|\hat\theta - \theta| < 1/\prod_{i=1}^{k} r_i\Big] \ge (1 - \epsilon/k)^k \ge 1 - \epsilon. \tag{51}$$

Thus, the MSE is bounded as

$$E(\hat\theta - \theta)^2 \le \prod_{i=1}^{k} r_i^{-2} + \epsilon \cdot 1, \tag{52}$$

where the second term accounts for the failure event, on which the
squared error is at most 1.
Setting the value of $\epsilon$ in the protocol to $3^{-2k}$ and noticing
that $r_i \in \{2, 3\}$, we have

$$E(\hat\theta - \theta)^2 \le 2\prod_{i=1}^{k} r_i^{-2}. \tag{53}$$

It is easy to prove by induction on $k$ that

$$1 + r_1 + r_1 r_2 + \cdots + r_1 r_2 \cdots r_{k-1} \le r_1 r_2 \cdots r_k. \tag{54}$$

From this inequality, Eq. (50), and the fact that both $n$ and
$\log N$ are of order $k$, it follows that the RMSE of our protocol
is of order $O(\log N/N)$.
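The bookkeeping behind this argument can be sketched in a few lines of Python. The sketch below computes the number of trials per step from Eq. (48), the total number of uses of $U$ from Eq. (50), and the RMSE bound from Eq. (53), and checks inequality (54) by brute force. The function names and the value of $\delta$ passed in are illustrative assumptions; the text only asserts that a suitable constant $\delta$ exists.

```python
import math
from itertools import product

def sample_size(delta, k, eps):
    """Smallest integer n with n > ln(2k/eps) / (2 delta^2), cf. Eq. (48)."""
    return math.floor(math.log(2 * k / eps) / (2 * delta ** 2)) + 1

def uses_per_trial_sum(radices):
    """1 + r_1 + r_1 r_2 + ... + r_1 ... r_{k-1}, the bracket in Eq. (50)."""
    total, uses = 0, 1
    for r in radices:
        total += uses   # the current step applies U `uses` times per trial
        uses *= r       # the next step amplifies by one more radix
    return total

def resources(radices, delta):
    """Return (n, N, RMSE bound) for one outcome r_1, ..., r_k of the protocol."""
    k = len(radices)
    eps = 3.0 ** (-2 * k)                       # the choice made before Eq. (53)
    n = sample_size(delta, k, eps)
    big_n = n * uses_per_trial_sum(radices)     # Eq. (50)
    rmse = math.sqrt(2) / math.prod(radices)    # square root of Eq. (53)
    return n, big_n, rmse

def check_inequality_54(max_k=8):
    """Verify Eq. (54) for every radix sequence in {2, 3}^k with k <= max_k."""
    return all(
        uses_per_trial_sum(rs) <= math.prod(rs)
        for k in range(1, max_k + 1)
        for rs in product((2, 3), repeat=k)
    )
```

Combining Eq. (54) with $N = n(1 + r_1 + \cdots)$ gives an RMSE bound of at most $\sqrt{2}\,n/N$, and $n$ grows linearly in $k$, which is where the $\log N$ factor comes from.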

One can see that the protocol depends only on the ability 
to amplify the parameter with linear cost before observing it. 
Thus it can be applied to any problem where the amplification 
is possible. We will see later a quite different use of this 
technique. 

Besides the generality, the amplification protocol used here 
is superior also in that it does not require the help of any 
entanglement which is indispensable in the parallel strategy. 
It even requires no joint measurement. This may make it easier 
to implement experimentally than the other protocols we have 
mentioned. 

To conclude this section, we have discussed various
strategies that can beat the standard quantum limit when
estimating the parameter of a unitary operation. It is worth
mentioning that, as a generalization, it was proved recently that
any unknown unitary can be estimated with error of order
$1/N$ [17], [18], [21]. Thus, all parameters of unitary operators
possess a quite non-classical property: they are easy to
estimate. However, we will see that this is not always true
for general quantum channels.

IV. Programmability and efficiency 

In this section, we begin to discuss the estimation
of parameters of a noisy quantum channel. We first give a
formal definition of the problem. As introduced in Section II, a
quantum channel is a completely positive and trace-preserving
superoperator. The set of all quantum channels that map density
operators on $\mathcal{H}_{in}$ to density operators on $\mathcal{H}_{out}$ forms a continuous
manifold $\mathcal{D}$. The unknown parameter belongs to a set $\Theta$, a
continuous manifold of finite dimension. Finally, a continuous
injection $\mathcal{E} : \Theta \to \mathcal{D}$ defines the family of parameterized
quantum channels $\{\mathcal{E}_\theta \mid \theta \in \Theta\}$.

A. Programmable quantum channels 

The idea of programmable gates stems from the design 
of digital circuits. It provides the convenience to change the 
functionality of a gate by the control over some of its inputs. 
Theoretically, it is possible to program all Boolean functions 
of n bits into a single gate. It is an interesting question to ask 
whether this is also possible for quantum gates [35]. However,
even the set of quantum gates on a single qubit
is uncountably infinite. Therefore, one cannot use classical
controls to achieve this. But does the use of quantum programs 
help here? Nielsen and Chuang [35] gave a negative answer 
to this question. 



To be precise, we define the notion of programmable gates
as follows. Let $\{\mathcal{E}_\theta\}$ be a family of quantum channels. It is
called programmable by $(\{\rho_\theta\}, \mathcal{Q})$ if there exist a family of
quantum states $\{\rho_\theta\}$ of a finite dimensional space $\mathcal{H}$ and a
quantum gate $\mathcal{Q}$ that does not depend on $\theta$ such that

$$\mathcal{E}_\theta(\rho) = \mathrm{tr}_{A'}\big(\mathcal{Q}(\rho_\theta^A \otimes \rho^B)\big), \tag{55}$$

for all $\theta$ and $\rho$. In this definition, system $A$ stores the quantum
program and system $B$ receives the input data. The output system $B'$
is not necessarily equal to $B$. This definition is illustrated in
Fig. 4. One can always choose the gate $\mathcal{Q}$ to be unitary without
loss of generality. When $\{\mathcal{E}_\theta\}$ is programmable we will also
say that the parameter $\theta$ is programmable.



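[Figure: circuit diagram. The gate $\mathcal{Q}$ takes the program $\rho_\theta$ on system $A$ and the input $\rho$ on system $B$, and outputs systems $A'$ and $B'$; the output on $B'$ is $\mathcal{E}_\theta(\rho)$.]

Fig. 4: Illustration of the definition of programmable channels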

What Nielsen and Chuang proved in [35] is that the
family of all unitary operations acting on $m$ qubits is not
programmable. This fact follows from the linearity of quantum
mechanics. Fortunately, they also pointed out the possibility
of programming the family of unitaries in a probabilistic way.
Vidal, Masanes and Cirac gave a more elegant construction
that implements unitaries probabilistically [36]. What is more,
the failure probability can be made exponentially small as the
size of the quantum program grows. We will present the protocol
of Vidal et al. in the following. It is important to note that
when we call a channel programmable, we mean "exact"
programmability as indicated by the above definition, not the
"probabilistic" compromise discussed here.

Generally, to prove that some family of channels is programmable,
we need to specify two things: the quantum program
$\rho_\theta$ and the quantum circuit $\mathcal{Q}$. The former can be written
out directly. We will describe the latter using the Quantum
Computation Language (QCL) designed by Ömer [37].
The language has a syntax derived from classical procedural
languages like C or Pascal. The main quantum features of
it used in this paper are the quantum data type qureg,
which is an array of qubits, and the statement measure
q[,var];, which measures the register q and assigns the
result to the integer variable var if specified. We will need one
more statement, discard, which is not included in QCL, to
represent the partial trace operation in Eq. (55). The comments
interlaced in the code may help if one is not familiar with
QCL.

Listed in Fig. 5 is the way of Vidal et al. of implementing

$$U(\theta) = \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix} \tag{56}$$

with probability $1 - 2^{-k}$ [36].

The quantum program $\rho_\theta$, referred to in the code as prog, is chosen to be

$$\bigotimes_{m=1}^{k} \left( e^{2^{m-1} i\theta}|0\rangle + e^{-2^{m-1} i\theta}|1\rangle \right),$$

and circuit $\mathcal{Q}$ is specified in the following code:

    int probunitary(qureg prog, qureg in)
    {
        int r;
        int k = #prog;             // set k to the length of the program
        for i=0 to k-1 {
            CNot(prog[i], in[0]);  // apply the CNot gate, in[0] is the control bit
            measure prog[i], r;    // measure the ith qubit of prog, store the result in r
            if r==0 { return 1; }  // success!
        }
        return 0;                  // fail
    }

Fig. 5: The probabilistic implementation of $U(\theta)$
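The mechanism of Fig. 5 can be traced numerically. The Python sketch below follows one data qubit through the loop, assuming that the target gate is $\mathrm{diag}(e^{i\theta}, e^{-i\theta})$, that the $m$th program qubit is the normalized state $(e^{2^{m-1} i\theta}|0\rangle + e^{-2^{m-1} i\theta}|1\rangle)/\sqrt{2}$, and that outcome 0 signals success; the function names and the branch-by-branch bookkeeping are illustrative choices, not taken from [36].

```python
import cmath
import math

def cnot_and_measure(data, phi, outcome):
    """Apply CNot (data qubit = control) to a program qubit with angle phi,
    then project the program qubit onto `outcome`.
    Returns (normalized data state, probability of that outcome)."""
    a, b = data
    s = 1 / math.sqrt(2)
    prog = (s * cmath.exp(1j * phi), s * cmath.exp(-1j * phi))
    # joint amplitudes amp[d][p] of |d>_data |p>_prog
    amp = [[a * prog[0], a * prog[1]], [b * prog[0], b * prog[1]]]
    # the CNot flips the program qubit when the data qubit is 1
    amp[1][0], amp[1][1] = amp[1][1], amp[1][0]
    post = (amp[0][outcome], amp[1][outcome])
    prob = abs(post[0]) ** 2 + abs(post[1]) ** 2
    norm = math.sqrt(prob)
    return (post[0] / norm, post[1] / norm), prob

def probunitary(data, theta, k):
    """Return (total success probability, list of data states on success)."""
    success_prob, branch_prob = 0.0, 1.0
    success_states = []
    state = data
    for m in range(1, k + 1):
        phi = 2 ** (m - 1) * theta
        good, p0 = cnot_and_measure(state, phi, 0)
        success_states.append(good)
        success_prob += branch_prob * p0
        # on failure the data qubit is rotated the wrong way; keep going
        state, p1 = cnot_and_measure(state, phi, 1)
        branch_prob *= p1
    return success_prob, success_states
```

Each failure rotates the data qubit by the wrong sign, and the doubled angle of the next program qubit is exactly what is needed to undo the accumulated error, which is why every successful branch ends in the same state.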



B. A no-go criterion and its applications 

We propose in the following a simple criterion that characterizes
a large class of inefficient parameters. The criterion is
another application of the quantum Cramér-Rao inequality.

Theorem 3: Any estimation of a programmable parameter is
of order $\Omega(1/\sqrt{N})$.

Proof: Suppose $\{\mathcal{E}_\theta\}$ is programmable by $(\{\rho_\theta\}, \mathcal{Q})$.
We need to show that any estimation of the parameter $\theta$ in
$\mathcal{E}_\theta$ is of order $\Omega(1/\sqrt{N})$. The main point here is that any
protocol $\mathcal{P}$ estimating $\theta$ of channel $\mathcal{E}_\theta$ can be easily translated
to a protocol $\mathcal{P}'$ that estimates the parameter of state $\rho_\theta$ with
the same efficiency. The only difference between $\mathcal{P}$ and $\mathcal{P}'$ is that
whenever $\mathcal{P}$ employs the channel $\mathcal{E}_\theta$, protocol $\mathcal{P}'$ takes a new
copy of $\rho_\theta$ and applies gate $\mathcal{Q}$. If the channel $\mathcal{E}_\theta$ is used $N$
times in protocol $\mathcal{P}$, then $\mathcal{P}'$ is a protocol estimating the parameter
$\theta$ using $N$ copies of $\rho_\theta$. As implied by the quantum Cramér-Rao
inequality, protocol $\mathcal{P}'$ is of order $\Omega(1/\sqrt{N})$, and so is
protocol $\mathcal{P}$. ■

We will now give some of the important applications of the
criterion stated in Theorem 3, beginning with simpler ones.

The first example considers the problem of estimating the
noise level of a qubit depolarizing channel parameterized by
$\theta$. That is, $\Theta = [0, 1]$, and

$$\mathcal{E}_\theta(\rho) = \theta\,\frac{I}{2} + (1 - \theta)\rho. \tag{57}$$

It is obvious that the larger the value of $\theta$, the noisier
the channel. We note that the problem of optimally estimating
parameters of a depolarizing channel was studied in [38].

This family of channels is easily seen to be programmable.
We list the code in Fig. 6. An immediate consequence is that
the parameter $\theta$ in Eq. (57) can never be estimated superefficiently.

For the channel defined in Eq. (57) to be completely
positive, $\theta$ can vary between $0$ and $4/3$. Therefore, we
can choose the parameter set to be the larger set $\Theta = [0, 4/3]$
and the parameter is still programmable. However, we will not
give the construction here because it is a special case of
the Pauli channels discussed below.

We have seen that the noise level of a depolarizing channel
cannot be estimated superefficiently. Similarly, any parameters
that play the role of probabilities, or functions related to
probabilities, can be programmed and are thus inefficient.
Another example of this type is the estimation of
parameters of the Pauli channel,

$$\mathcal{E}_\theta(\rho) = \sum_{i=0}^{3} p_i(\theta)\, \sigma_i \rho\, \sigma_i, \tag{58}$$

where the $\sigma_i$'s are the Pauli matrices defined earlier. The
programmable implementation of this family is given in
Fig. 7.

As mentioned, the family of depolarizing channels defined
in Eq. (57) is in fact the family of Pauli channels with
$p_0 = 1 - 3\theta/4$, $p_1 = p_2 = p_3 = \theta/4$; the bit flip
and phase flip channels are Pauli channels as well. The problem of
parameter estimation for this family is discussed in [39]. Using our
criterion, we immediately see that no matter how carefully we design the
estimator, the estimation protocol we obtain is only as efficient
as the most trivial one, tomography for example, up to some
constant factor.
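The claim that the depolarizing channel of Eq. (57) is the Pauli channel with $p_0 = 1 - 3\theta/4$ and $p_1 = p_2 = p_3 = \theta/4$ can be checked numerically. The sketch below uses plain Python lists as $2 \times 2$ matrices; the helper names and the test state are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]

def pauli_channel(rho, probs):
    """sum_i p_i sigma_i rho sigma_i, Eq. (58); Pauli matrices are Hermitian."""
    out = [[0j, 0j], [0j, 0j]]
    for p, sigma in zip(probs, PAULIS):
        term = matmul(matmul(sigma, rho), sigma)
        for i in range(2):
            for j in range(2):
                out[i][j] += p * term[i][j]
    return out

def depolarizing(rho, theta):
    """theta * I/2 + (1 - theta) * rho, Eq. (57), for a trace-one rho."""
    return [[theta * 0.5 * (i == j) + (1 - theta) * rho[i][j] for j in range(2)]
            for i in range(2)]

def max_difference(a, b):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

For a density matrix `rho` and any `theta`, `pauli_channel(rho, [1 - 3*theta/4, theta/4, theta/4, theta/4])` and `depolarizing(rho, theta)` should agree up to rounding.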

There are some other simple examples that can be analyzed
using Theorem 3. For example, the channel

$$\mathcal{E}_\theta(\rho) = \theta|0\rangle\langle 0| + (1 - \theta)\rho \tag{59}$$

is programmable and thus $\theta$ is inefficient. Another similar
example is defined as

$$\mathcal{E}_\theta(\rho) = \epsilon|\theta\rangle\langle\theta| + (1 - \epsilon)\rho, \tag{60}$$

where $\epsilon$ is known and $|\theta\rangle$ is some pure state depending on the
parameter.

The quantum program, prog, is chosen to be

$$(\sqrt{1-\theta}\,|0\rangle + \sqrt{\theta}\,|1\rangle) \otimes (|00\rangle + |11\rangle)/\sqrt{2},$$

and the following code specifies the circuit:

    procedure depolarizing(qureg prog, qureg in)
    {
        int r;
        measure prog[0], r;        // measure the first qubit of prog, store the result in r
        if r==1 {
            Swap(prog[1], in[0]);  // apply the Swap gate
        }
        discard prog;              // trace out prog
    }

Fig. 6: The programmable implementation of the depolarizing channel

The quantum program, prog, is chosen to be

$$\sum_{i=0}^{3} \sqrt{p_i(\theta)}\,|i\rangle,$$

and the following code specifies the circuit:

    procedure Pauli(qureg prog, qureg in)
    {
        int r;
        measure prog, r;   // measure prog and store the result in r
        if r==1 {
            X(in[0]);      // apply the X gate
        }
        if r==2 {
            Y(in[0]);      // apply the Y gate
        }
        if r==3 {
            Z(in[0]);      // apply the Z gate
        }
        discard prog;      // trace out prog
    }

Fig. 7: The programmable implementation of the Pauli channel

The only family of qubit channels of common interest
whose estimation efficiency cannot be characterized using



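Theorem 3 is the amplitude damping channels. The following
Kraus operators give the parametrization:

$$E_0 = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{1-\theta} \end{pmatrix}, \quad E_1 = \begin{pmatrix} 0 & \sqrt{\theta} \\ 0 & 0 \end{pmatrix}. \tag{61}$$

It is proved in [40] that the family of amplitude damping
channels is not programmable. Therefore, Theorem 3 does not
apply any more. Interestingly, however, the so-called generalized
amplitude damping channels, which have the Kraus operators

$$E_0 = \sqrt{p}\begin{pmatrix} 1 & 0 \\ 0 & \sqrt{1-\theta} \end{pmatrix}, \quad E_1 = \sqrt{p}\begin{pmatrix} 0 & \sqrt{\theta} \\ 0 & 0 \end{pmatrix},$$
$$E_2 = \sqrt{1-p}\begin{pmatrix} \sqrt{1-\theta} & 0 \\ 0 & 1 \end{pmatrix}, \quad E_3 = \sqrt{1-p}\begin{pmatrix} 0 & 0 \\ \sqrt{\theta} & 0 \end{pmatrix}, \tag{62}$$

are programmable for any fixed $p \in (0, 1)$. We will prove it
in Section IV-D.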



C. Classical channels are all programmable 

We will show in the following that classical information
channels are quantum programmable. Interestingly, this gives
a simple proof of the fact that parameters of any family of
classical channels cannot be estimated with a convergence rate
better than $1/\sqrt{N}$.

A discrete memoryless channel (DMC) is characterized by a set of
transition probabilities $p(y|x)$ which define an $m$ by $n$ stochastic
matrix $Q = (q_{xy})$. Just as any classical computation can
be thought of as a special quantum computation, we can also
regard the DMC $Q$ as a quantum channel $\mathcal{Q}$. Channel $\mathcal{Q}$ will
assume restricted inputs, namely, diagonal density matrices,
and will guarantee that the output is a diagonal density matrix too.
The main idea of the programmable construction of $\mathcal{Q}$ is to
simulate the transition probabilities after measuring the input
in the computational basis. The construction is given in Fig. 8.

Thus, it follows from Theorem 3 that no parameter of a
DMC can be estimated superefficiently. This is in contrast
to the quantum case. Table I summarizes
whether a parameter can be superefficient or not.





              Source    Channel
Classical     NO        NO
Quantum       NO        Possibly YES

TABLE I
Superefficient parameters
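For a concrete classical instance, consider the binary symmetric channel with unknown flip probability $p$. The Python sketch below computes the exact RMSE of the natural frequency estimator (observed flips divided by $N$) from the binomial distribution. It illustrates the $1/\sqrt{N}$ rate for this one estimator only and is not a proof of the lower bound; the function name and the chosen values of $p$ and $N$ are illustrative assumptions.

```python
import math

def rmse_frequency_estimator(p, n_uses):
    """Exact RMSE of (number of flips)/N after N uses of a binary symmetric
    channel with flip probability p, with a known input on every use."""
    mse = 0.0
    for flips in range(n_uses + 1):
        prob = math.comb(n_uses, flips) * p ** flips * (1 - p) ** (n_uses - flips)
        mse += prob * (flips / n_uses - p) ** 2
    return math.sqrt(mse)
```

Quadrupling the number of channel uses halves the error, in agreement with the closed form $\sqrt{p(1-p)/N}$.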



D. Depolarizing noise ruins the efficiency 

Our last application of the "no-go" criterion considers the
family of quantum channels of the following form:

$$\mathcal{E}_\theta(\rho) = \epsilon\rho_0 + (1 - \epsilon)\,\mathcal{U}_\theta(\rho), \tag{63}$$

where $\epsilon \in (0, 1]$, $\rho_0$ is independent of $\theta$ and $\mathcal{U}_\theta$ is another
family of channels. Let $\mathcal{U}_\theta$ be a family of unitaries first. For
example,

$$\mathcal{U}_\theta(\rho) = U_\theta \rho\, U_\theta^\dagger, \quad U_\theta = \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix}. \tag{64}$$

When $\rho_0$ is $I/d$, the channel $\mathcal{E}_\theta$ is an imperfect implementation
of $\mathcal{U}_\theta$ disturbed by the depolarizing noise. If $\epsilon$ is close
to 0, then $\mathcal{E}_\theta$ is intuitively also close to $\mathcal{U}_\theta$. As we have seen
that $\theta$ in $\mathcal{U}_\theta$ can be estimated with error of order $1/N$, we may expect that $\theta$
in $\mathcal{E}_\theta$ can also be estimated superefficiently. However, we will
see that no matter how small $\epsilon$ is, this cannot happen. That
is, the depolarizing noise totally ruins the efficiency of
estimation!

The proof is simple. As we have shown how to program a unitary
operation with arbitrarily high probability in Section IV-A,
we can construct the programmable realization of $\mathcal{E}_\theta$ easily:
implement $\mathcal{U}_\theta$ probabilistically and output $\rho_0$ on failure. See
Fig. 9 for details.

The above negative result casts a shadow on all of the fast
estimation protocols proposed so far. As a small amount of
depolarizing noise is unavoidable in estimation
experiments, it seems that we will not be able to
beat the standard quantum limit in practice. However, as long
as the noise is indeed small and the size $N$ is not large enough
to amplify the noise to a noticeable magnitude, it will still
be possible to exploit the quantum advantage of the fast
estimation protocols [16]. That is, there is a trade-off between
the noise level and the sample size to preserve the speedup.
We analyze the modified bitwise estimation protocol as a
demonstration.

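Consider the family of channels

$$\mathcal{E}_\theta(\rho) = \epsilon\,\frac{I}{2} + (1-\epsilon)\, U_\theta \rho\, U_\theta^\dagger \tag{65}$$

with $U_\theta$ defined in Eq. (43):

$$U_\theta = \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i\theta} \end{pmatrix}.$$

We will see that the scaling of $\log N/N$ is preserved if

$$N\epsilon < 1, \tag{66}$$

which represents the trade-off rigorously. To see this, notice
by induction on $m$ that

$$\mathcal{E}_\theta^m(\rho) = \big(1 - (1-\epsilon)^m\big)\frac{I}{2} + (1-\epsilon)^m\, U_\theta^m \rho\, U_\theta^{\dagger m}. \tag{67}$$

It implies that the corresponding probabilities $P'_+$ and $P'_-$ are
related to their noiseless counterparts as

$$P'_+ = \frac{1-(1-\epsilon)^m}{2} + (1-\epsilon)^m P_+, \quad P'_- = \frac{1-(1-\epsilon)^m}{2} + (1-\epsilon)^m P_-. \tag{68}$$

As $m$ is at most $\prod_i r_i = O(N/\log N)$ in the protocol, we
have

$$(1-\epsilon)^m \ge 1 - m\epsilon \ge 1 - \frac{m}{N} \ge 1 - O(1/\log N). \tag{69}$$

That is, $(1-\epsilon)^m$ is close to 1 for large $N$, and thus $P'_\pm$ are
also close to $P_\pm$. The efficiency of order $O(\log N/N)$ remains
as claimed.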
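The relation between the noisy and noiseless probabilities can be checked by iterating the channel directly. The Python sketch below assumes $U_\theta = \mathrm{diag}(1, e^{2\pi i\theta})$, the input state $|+\rangle\langle +|$, and a measurement in the Hadamard basis; the function names and parameter values are illustrative.

```python
import cmath
import math

def noisy_step(rho, theta, eps):
    """One use of rho -> eps*I/2 + (1-eps) U rho U^dagger, U = diag(1, e^{2 pi i theta})."""
    phase = cmath.exp(2j * math.pi * theta)
    rotated = [[rho[0][0], rho[0][1] * phase.conjugate()],
               [rho[1][0] * phase, rho[1][1]]]
    return [[eps * 0.5 * (i == j) + (1 - eps) * rotated[i][j] for j in range(2)]
            for i in range(2)]

def noisy_prob_plus(theta, eps, m):
    """Probability of outcome + after m sequential uses of the noisy channel on |+>."""
    rho = [[0.5, 0.5], [0.5, 0.5]]
    for _ in range(m):
        rho = noisy_step(rho, theta, eps)
    # <+|rho|+> is half the sum of all four entries
    return complex(rho[0][0] + rho[0][1] + rho[1][0] + rho[1][1]).real / 2

def predicted_prob_plus(theta, eps, m):
    """Closed form: (1-q)/2 + q * cos^2(pi m theta) with q = (1-eps)^m."""
    q = (1 - eps) ** m
    return (1 - q) / 2 + q * math.cos(math.pi * m * theta) ** 2
```

With `eps` well below `1/m` the two probabilities differ from the noiseless value $\cos^2(\pi m\theta)$ by at most about `m * eps / 2`, which is the quantitative content of the trade-off.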

As it is proved in [36] that any unitary operation can be 
programmed with arbitrarily high probability, one can easily 
see from Eq. (0 that any quantum channel can also be 
programmed probabilistically. Thus, we can easily generalize 
the result to quantum channels of the form of Eq. (63) where the $\mathcal{U}_\theta$ are
no longer restricted to unitaries.

In the rest of this section, we give a more direct characteri- 
zation of channels suffering from depolarizing noise and use it 
to discuss the estimation problem of the generalized amplitude 
damping channels. 

It is well known that there is a correspondence between a
quantum channel $\mathcal{E}$ and its state representative (a.k.a. Choi
matrix) $(\mathcal{I} \otimes \mathcal{E})(|\Psi\rangle\langle\Psi|)$, where $|\Psi\rangle$ is the maximally entangled
state. A superoperator $\mathcal{E}$ is completely positive if and only if its
Choi matrix is positive semidefinite [29]. Using this correspondence,
it is not difficult to show the following lemma, and
we omit its proof.

Lemma 1: Let $\mathcal{E}$ be a quantum channel with Choi matrix
$C$. There exist some constant $\epsilon \in (0, 1)$ and quantum channel
$\mathcal{U}$ such that

$$\mathcal{E}(\rho) = \epsilon\rho_0 + (1 - \epsilon)\,\mathcal{U}(\rho), \tag{70}$$

if and only if the null space of $C$ is contained in that of
$|+\rangle\langle +| \otimes \rho_0$, where $|+\rangle$ is the state $\sum_{i=0}^{d-1} |i\rangle/\sqrt{d}$.

Assume without loss of generality that $n = 2^k$. The quantum program, prog, is chosen to be

$$\left[ \bigotimes_{x=0}^{m-1} \sum_{y=0}^{n-1} \sqrt{q_{xy}}\,|y\rangle \right] \otimes |0\rangle,$$

whose last part consists of $k$ qubits, and the following code specifies the circuit:

    procedure dmc(qureg prog, qureg in)
    {
        int r;
        measure in, r;                       // measure the input and store the result in r
        measure prog[r*k::k];                // measure the subregister of length k starting at r*k
        Swap(prog[r*k::k], prog[m*k::k]);    // swap two subregisters of prog
        discard prog[::m*k];                 // prog[m*k::k], the last part of prog, remains
                                             // and will be the output
        discard in;
    }

Fig. 8: The programmable implementation of a classical DMC.

Choose $k$ such that $\epsilon > 2^{-k}$. The quantum program, prog, is chosen to be

$$\bigotimes_{m=1}^{k} \left( e^{2^{m-1} i\theta}|0\rangle + e^{-2^{m-1} i\theta}|1\rangle \right) \otimes \rho_0 \otimes \big(|1\rangle\langle 1| + (\epsilon - 2^{-k})Z\big),$$

and the following code specifies the circuit:

    procedure dnoise(qureg prog, qureg in)
    {
        int r;
        measure prog[k+1], r;        // measure the last qubit of prog, store the result in r
        if probunitary(prog[::k], in)==0 or r==0 {
            Swap(prog[k], in[0]);    // Swap (with probability \epsilon)
        }
        discard prog;
    }

Fig. 9: The programmable implementation of $\mathcal{E}_\theta$

An immediate corollary of the lemma is that when the Choi
matrix of a channel $\mathcal{E}$ is positive definite, $\mathcal{E}$ suffers from depolarizing
noise. We use this fact to show the programmability
of the generalized amplitude damping channel for fixed $p \in (0, 1)$
and parameter set $\Theta = [a, 1]$, $a > 0$. The Choi matrix of the
generalized amplitude damping channel is

$$\frac{1}{2}\begin{pmatrix} 1-\theta+p\theta & 0 & 0 & \sqrt{1-\theta} \\ 0 & \theta-p\theta & 0 & 0 \\ 0 & 0 & p\theta & 0 \\ \sqrt{1-\theta} & 0 & 0 & 1-p\theta \end{pmatrix}, \tag{71}$$



whose eigenvalues are $p\theta/2$, $(\theta - p\theta)/2$ and

$$\frac{2 - \theta \pm \sqrt{(2-\theta)^2 - 4p(1-p)\theta^2}}{4}. \tag{72}$$

It is easy to verify that all these eigenvalues are strictly positive.
The programmability therefore follows from the above
lemma and the fact that channels that suffer from depolarizing
noise are programmable.
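The positivity claim can be checked numerically. The Python sketch below builds the Choi matrix from the four Kraus operators of the generalized amplitude damping channel, using the normalization $C = \frac{1}{2}\sum_{i,j} |i\rangle\langle j| \otimes \mathcal{E}(|i\rangle\langle j|)$, and compares the trace and the determinant of the nontrivial $2 \times 2$ block with the closed-form eigenvalues. The function names, the index convention and the sample values of $p$ and $\theta$ are assumptions made for illustration.

```python
import math

def gad_kraus(p, theta):
    """Kraus operators E_0..E_3 of the generalized amplitude damping channel."""
    a, b = math.sqrt(1 - theta), math.sqrt(theta)
    sp, sq = math.sqrt(p), math.sqrt(1 - p)
    return [
        [[sp, 0.0], [0.0, sp * a]],
        [[0.0, sp * b], [0.0, 0.0]],
        [[sq * a, 0.0], [0.0, sq]],
        [[0.0, 0.0], [sq * b, 0.0]],
    ]

def choi(kraus):
    """C[2i+a][2j+b] = (1/2) sum_K K[a][i] * K[b][j] for real Kraus operators."""
    c = [[0.0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            for a in range(2):
                for b in range(2):
                    c[2 * i + a][2 * j + b] = 0.5 * sum(K[a][i] * K[b][j] for K in kraus)
    return c

def choi_closed_form(p, theta):
    """The matrix written out in closed form in the text."""
    s = math.sqrt(1 - theta)
    return [[(1 - theta + p * theta) / 2, 0.0, 0.0, s / 2],
            [0.0, (theta - p * theta) / 2, 0.0, 0.0],
            [0.0, 0.0, p * theta / 2, 0.0],
            [s / 2, 0.0, 0.0, (1 - p * theta) / 2]]

def claimed_eigenvalues(p, theta):
    """The four eigenvalues stated in the text."""
    root = math.sqrt((2 - theta) ** 2 - 4 * p * (1 - p) * theta ** 2)
    return [p * theta / 2, (theta - p * theta) / 2,
            (2 - theta + root) / 4, (2 - theta - root) / 4]
```

Since the matrix is block diagonal, two eigenvalues can be read off the diagonal and the other two are fixed by the trace and determinant of the block spanned by the first and last basis vectors.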

V. Fast estimation of parameters in noisy channels

Up to now, we have seen that parameters of unitary operations
can be estimated superefficiently. We have also shown
many examples of quantum channels in which superefficient
parameter estimation is impossible by the "no-go" criterion
we provide. We will now answer the question of whether fast
parameter estimation can occur in non-unitary channels, or whether it
is a phenomenon unique to unitaries. Of course, there
are trivial positive examples: $\mathcal{E}_\theta(\rho) = \frac{I}{d} \otimes \mathcal{U}_\theta(\rho)$, where $\frac{I}{d}$ is the
maximally mixed state and $\mathcal{U}_\theta$ is unitary. What we will discuss
in the following is not of this kind.

We first give the family of channels of interest and the
protocol to estimate the parameter. Define an intermediate
matrix $E$ as the qubit rotation

$$E = \cos\theta\, I + i\sin\theta\, Y = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}, \tag{73}$$

where $\theta \in [0, \pi/2]$, and define $E_0$ and $E_1$ in terms of $E$ as

$$E_0 = \sqrt{\eta_\theta}\, E, \quad E_1 = \sqrt{1 - \eta_\theta}\, ZE. \tag{74}$$

Here, $\eta_\theta$ is any real function of $\theta$ that ranges in $[0, 1]$ and $Z$
is one of the Pauli matrices defined earlier.

Now, we can write out the family of channels explicitly
using the operator-sum representation:

$$\mathcal{E}_\theta(\rho) = E_0 \rho E_0^\dagger + E_1 \rho E_1^\dagger, \tag{75}$$

with $\theta \in \Theta = [0, \pi/2]$.

When $\eta_\theta = 1/2$, a constant function, this family of quantum
channels is a collection of single-qubit observables, which is
indeed far from unitary evolutions as claimed. Specifically, it is
easy to check that, in this case,

$$|0\rangle\langle\theta_0| = \frac{1}{\sqrt{2}}E_0 + \frac{1}{\sqrt{2}}E_1, \quad |1\rangle\langle\theta_1| = -\frac{1}{\sqrt{2}}E_0 + \frac{1}{\sqrt{2}}E_1, \tag{76}$$

where $|\theta_0\rangle = \cos\theta|0\rangle + \sin\theta|1\rangle$ and $|\theta_1\rangle = \sin\theta|0\rangle - \cos\theta|1\rangle$.
As the matrix

$$\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \tag{77}$$

is unitary, it follows from the unitary freedom of the operator-sum
representation that, when $\eta_\theta = 1/2$, we can write

$$\mathcal{E}_\theta(\rho) = |0\rangle\langle\theta_0|\rho|\theta_0\rangle\langle 0| + |1\rangle\langle\theta_1|\rho|\theta_1\rangle\langle 1|, \tag{78}$$

an operator-sum representation of the projective measurement
along the basis $\{|\theta_0\rangle, |\theta_1\rangle\}$. We will thus call the family
defined in Eq. (75) the projector-class channels. Denote by
$\mathcal{M}_\theta$ the special case of the family with $\eta_\theta = 1/2$ and denote by
$\mathcal{M} = \mathcal{M}_0$ the measurement along the computational basis. Our
result can be summarized in the following theorem.

Theorem 4: Parameter $\theta$ of the projector-class channels can be
estimated with error of order $O(\log N/N)$.

To prove the theorem, we will specify the protocol by
employing the amplification technique introduced in Section III-B.
It is only necessary to construct a distribution
related to the amplified parameter $n\theta$ by $n$ uses of the channel.
We will achieve it in two steps.

First, prepare an $n$-qubit entangled state

$$|\Phi_n\rangle = \sum_{i \in E_n} (-1)^{w(i)/2}\, |i\rangle / \sqrt{2^{n-1}}, \tag{79}$$

where $E_n$ is the set of all $n$-bit strings of 0 and 1 with even
parity and $w(i)$ is the Hamming weight of $i$. This state was first
used by the authors in [41] to identify observables.

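Next, apply the channel $\mathcal{E}_\theta$ on each qubit of the state and
measure in the computational basis.

We now calculate the probability $\Pr(even)$ that the measurement
outcomes have even parity. The following identities
simplify the analysis:

$$\mathcal{M} \circ \mathcal{E}_\theta = \mathcal{M}_\theta = \mathcal{M} \circ \mathcal{U}_\theta, \tag{80}$$

where $\mathcal{U}_\theta$ corresponds to the unitary operation $|0\rangle\langle\theta_0| + |1\rangle\langle\theta_1|$.
The second equality is obvious and we prove the first one
only. $\mathcal{M} \circ \mathcal{E}_\theta$ has a representation consisting of the following
four operators:

$$|0\rangle\langle 0|E_0 = \sqrt{\eta_\theta}\,|0\rangle\langle\theta_0|, \quad |0\rangle\langle 0|E_1 = \sqrt{1-\eta_\theta}\,|0\rangle\langle\theta_0|,$$
$$|1\rangle\langle 1|E_0 = -\sqrt{\eta_\theta}\,|1\rangle\langle\theta_1|, \quad |1\rangle\langle 1|E_1 = \sqrt{1-\eta_\theta}\,|1\rangle\langle\theta_1|. \tag{81}$$

The first two operators can be merged into one of the operators of
$\mathcal{M}_\theta$, and so can the last two. The equality is thus proved.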

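An implication of this identity is that $\Pr(even)$ remains the
same if we substitute $\mathcal{U}_\theta$ for $\mathcal{E}_\theta$ in the second step of the above
protocol. That is,

$$\Pr(even) = \sum_{j\in E_n} \langle j|\,\mathcal{U}_\theta^{\otimes n}(|\Phi_n\rangle\langle\Phi_n|)\,|j\rangle = \sum_{j\in E_n} |\langle j|U^{\otimes n}|\Phi_n\rangle|^2, \tag{82}$$

where $U = |0\rangle\langle\theta_0| + |1\rangle\langle\theta_1|$.

The element in the $j$th row and $k$th column of $U^{\otimes n}$ is

$$(-1)^{w(j\cdot k)} (\cos\theta)^{n-d(j,k)} (\sin\theta)^{d(j,k)}, \tag{83}$$

where $d(j, k)$ is the Hamming distance function and $j \cdot k$ is
the bitwise AND of $j$ and $k$. Consequently, $\langle j|U^{\otimes n}|\Phi_n\rangle$ equals

$$\frac{1}{\sqrt{2^{n-1}}} \sum_{k\in E_n} (-1)^{w(k)/2} (-1)^{w(j\cdot k)} (\cos\theta)^{n-d(j,k)} (\sin\theta)^{d(j,k)}. \tag{84}$$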

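Noticing that $2w(j \cdot k) + d(j, k) = w(j) + w(k)$, the
above quantity can be further simplified as

$$\frac{(-1)^{w(j)/2}}{\sqrt{2^{n-1}}} \sum_{k\in E_n} (-1)^{-d(j,k)/2} (\cos\theta)^{n-d(j,k)} (\sin\theta)^{d(j,k)}. \tag{85}$$

We can write the summation without the multiplicative constant
$(-1)^{w(j)/2}/\sqrt{2^{n-1}}$ as

$$\sum_{k\in E_n} (\cos\theta)^{n-d(j,k)} (-i\sin\theta)^{d(j,k)}. \tag{86}$$

For every even $l$, there are exactly $\binom{n}{l}$ different $k$'s in $E_n$ such that
$d(j, k) = l$. Therefore, the summation is

$$\sum_{l\ \mathrm{even}} \binom{n}{l} \cos^{n-l}\theta\, (-i\sin\theta)^{l} = \mathrm{Re} \sum_{l=0}^{n} \binom{n}{l} \cos^{n-l}\theta\, (-i\sin\theta)^{l} = \mathrm{Re}\, e^{-in\theta} = \cos(n\theta). \tag{87}$$

Taking the constant into account, we have

$$\langle j|U^{\otimes n}|\Phi_n\rangle = \frac{(-1)^{w(j)/2}}{\sqrt{2^{n-1}}} \cos(n\theta), \tag{88}$$

which enables us to finish the calculation of $\Pr(even)$ in
Eq. (82) as

$$\Pr(even) = \sum_{j\in E_n} |\langle j|U^{\otimes n}|\Phi_n\rangle|^2 = \cos^2(n\theta). \tag{89}$$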
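The amplification identity $\Pr(even) = \cos^2(n\theta)$ can be confirmed by brute force for small $n$. The Python sketch below enumerates the even-parity bit strings, builds the amplitudes of the entangled input state with signs $(-1)^{w(i)/2}$, applies $U = |0\rangle\langle\theta_0| + |1\rangle\langle\theta_1|$ to every qubit, and sums the probabilities of the even-parity outcomes. The function name and the value of $\theta$ used in the check are illustrative assumptions.

```python
import math
from itertools import product

def pr_even(n, theta):
    """Probability of an even-parity outcome after applying U to each of n qubits."""
    c, s = math.cos(theta), math.sin(theta)
    u = [[c, s], [s, -c]]            # rows <0|U and <1|U in the computational basis
    even = [bits for bits in product((0, 1), repeat=n) if sum(bits) % 2 == 0]
    norm = math.sqrt(2 ** (n - 1))   # there are 2^(n-1) even-parity strings
    total = 0.0
    for j in even:
        amplitude = 0.0
        for k in even:
            sign = (-1) ** (sum(k) // 2)      # (-1)^{w(k)/2}
            element = 1.0
            for jb, kb in zip(j, k):          # <j|U^{(x)n}|k> factorizes over qubits
                element *= u[jb][kb]
            amplitude += sign * element / norm
        total += amplitude ** 2
    return total
```

The cost of this check grows like $4^n$, so it is only meant for small $n$; it does not replace the combinatorial argument above.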



The parameter is thus amplified as promised with the help
of the GHZ entangled state. We need only follow the
idea of the modified bitwise estimation protocol discussed in
Section III-B. Clearly, the estimator is of order $O(\log N/N)$.
We do not know whether it is possible to achieve the
order of $1/N$ in this problem, as in the case of estimating
unitary operations.

VI. Conclusions

In this paper, we have discussed the estimation theory of 
parameters of quantum channels with emphasis on evaluating 
the efficiency of estimation protocols. It is clear now that 
there are two fundamentally different types of parameters of 
quantum channels, one of which can be estimated supereffi- 
ciently and the other cannot. The fact that all programmable 
parameters are inefficient provides us with an easy-to-use yet 
powerful way of determining whether superefficient estimation 
is possible. Based on this fact, we have shown many examples 
of inefficient parameters. What is more, it also follows that all
parameters of classical information channels are inefficient and
that depolarizing noise undermines the efficiency universally.
We have also constructed the so-called projector-class
channels and provided a superefficient protocol to estimate
the parameter of this family. Thus superefficient estimation is
not a phenomenon unique to unitary operations. What remains
for future work is to evaluate
the power of the "no-go" criterion in terms of programmability,
and to characterize, in a more direct way, the parameters of a
quantum channel that can be estimated superefficiently.